Data Quality Management in a Database Cluster with Lazy Replication

Authors

  • Cécile Le Pape
  • Stéphane Gançarski
  • Patrick Valduriez
Abstract

We consider the use of a database cluster with lazy replication. In this context, controlling the quality of replicated data based on users’ requirements is important to improve performance. However, existing approaches are limited to a particular aspect of data quality. In this paper, we propose a general model of data quality that distinguishes between the “freshness” and the “validity” of data. Data quality is expressed through measures of divergence from the data with perfect quality. Users can thus specify the minimum level of quality for their queries. This information can be exploited to optimize query load balancing. We implemented our approach in our Refresco prototype. The results show that freshness control can help increase query throughput significantly. They also show significant improvement when freshness requirements are specified at the relation level rather than at the database level.
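As an illustration only, the idea of exploiting a per-query quality requirement for load balancing can be sketched as follows. This is not the Refresco algorithm, which the abstract does not describe; the `Replica` class, the `pending_updates` divergence measure, the `load` measure, and the `route` function are all hypothetical names assumed for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Replica:
    name: str
    pending_updates: int  # assumed divergence measure: updates not yet applied here
    load: int             # assumed load measure: queries currently running here


def route(replicas: Sequence[Replica], max_divergence: int) -> Optional[Replica]:
    """Return the least-loaded replica whose divergence is within the tolerance."""
    fresh_enough = [r for r in replicas if r.pending_updates <= max_divergence]
    if not fresh_enough:
        # No replica qualifies; a real system would presumably refresh one first.
        return None
    return min(fresh_enough, key=lambda r: r.load)
```

Under these assumptions, a query that tolerates more divergence has more candidate replicas, so it is more likely to land on a lightly loaded node.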

Similar resources

Robust Snapshot Replication

An important technique to ensure the scalability and availability of clustered computer systems is data replication. This paper describes a new approach to data replication management called Robust Snapshot Replication. It combines an update anywhere approach (so updates can be evaluated on any replica, spreading their load) with lazy update propagation and snapshot isolation concurrency contro...


Database Replication: If You Must be Lazy, be Consistent

Due to severe performance penalties associated with synchronous replication, there is a significant interest in asynchronous replica management protocols. Lazy protocols currently in use either do not guarantee consistency and serializability as needed by transactional semantics or they impose restrictions on placement of data and which data object can be updated. In this paper we consider an a...


Replica Refresh Strategies in a Database Cluster

Relaxing replica freshness has been exploited in database clusters to optimize load balancing. However, in most approaches, refreshment is typically coupled with other functions such as routing or scheduling, which make it hard to analyze the impact of the refresh strategy itself on performance. In this paper, we propose to support routing-independent refresh strategies in a database cluster wi...


Fine-grained Refresh Strategies for Managing Replication in Database Clusters

Relaxing replica freshness has been exploited in database clusters to optimize load balancing. In this paper, we propose to support both routing-dependent and routing-independent refresh strategies in a database cluster with multi-master lazy replication. First, we propose a model for capturing refresh strategies. Second, we describe the support of this model in a middleware architecture for fr...


Preventive Multi-master Replication in a Cluster of Autonomous Databases

We consider the use of a cluster of PC servers for Application Service Providers where applications and databases must remain autonomous. We use data replication to improve data availability and query load balancing (and thus performance). However, replicating databases at several nodes can create consistency problems, which need to be managed through special protocols. In this paper, we presen...


Journal:
  • JDIM

Volume 3  Issue 

Pages  -

Publication date 2005